Phase diagrams for the spatial public goods game with pool-punishment

The efficiency of institutionalized punishment is studied by evaluating the stationary states in the spatial
public goods game comprising unconditional defectors, cooperators, and cooperating pool-punishers as the three 
competing strategies. Fine and cost of pool-punishment are considered as the two main parameters determining 
the stationary distributions of strategies on the square lattice. Each player collects its payoff from five five-person 
public goods games, and the evolution of strategies is subsequently governed by imitation based on pairwise 
comparisons at a low level of noise. The impact of pool-punishment on the evolution of cooperation in structured 
populations is significantly different from that reported previously for peer-punishment. Representative phase 
diagrams reveal remarkably rich behavior, depending also on the value of the synergy factor that characterizes 
the efficiency of investments paid into the common pool. Besides traditional single and two-strategy stationary
states, a rock-paper-scissors type cyclic dominance can emerge in strikingly different ways. 

PACS numbers: 89.65.-s, 89.75.Fb, 87.23.Kg



I. INTRODUCTION 

The importance of punishment for the maintenance of co- 
operative behavior in human societies can be quantified by 
studying spatial public goods games (PGG) with players 
forming overlapping groups. For the simple two-strategy case 
players (within all the groups) decide simultaneously whether 
they wish to contribute to the common pool (cooperate) or 
not (defect). Subsequently, the multiplied total investment is 
divided equally among all the group members irrespectively 
of their initial decision. In this situation the rational (selfish) 
players should decline to contribute if the investment costs exceed
the return of the game [1, 2]. As a result, selfish players
fail to benefit from mutual cooperation and the society evolves
towards the "tragedy of the commons" [3]. Human experiments
and mathematical models alike have shown, however,
that cooperative behavior can be promoted by punishing defectors
for a wide class of social dilemmas, including the prisoner's
dilemma game. In fact, it can be stated that some elements
of punishment can be recognized within all the relevant
mechanisms [4] supporting cooperation among selfish individuals [5-8].

Traditionally, the sanctions foreseen by punishment are 
considered to be costly. While those that are punished bear 
a fine, the punishers must bear the cost of punishment. Both 
fine and cost may substantially reduce the overall income of 
the corresponding players. There are, however, different ways
in which the income reduction can be executed, depending on the
governing evolutionary rules, the set of strategies, as well as
on the network structure and group formation, among others.
Many aspects of punishment have already been investigated by
experiments [9-17], as well as by means of mathematical models
with three [18-21], four [22, 23], and even more strategies.

Here we study the effects of pool-punishment in the spatial
PGG and contrast the results with those reported previously
for peer-punishment. Pool-punishment is synonymous with
institutionalized punishment, where the contributions
of punishers are meant to cover the costs of institutions
like the police or other elements of the justice system independently
of their necessity or efficiency [29]. In contrast, with
peer-punishment [10, 30] the punishers pay the cost of punishment
only if it is necessary, i.e. when the defectors are identified
and sanctioned. In the absence of defectors the income
of peer-punishers is therefore equivalent to that of pure cooperators,
who refuse to bear the cost of punishment and are
thus frequently referred to as the "second-order free-riders".
On the other hand, because of their permanent contributions
to the punishment pool the income of pool-punishers is always
smaller than that of cooperators. A preceding study on
pool-punishment in well-mixed populations [29] concluded
that pool-punishers can prevail over peer-punishers only if the 
second-order free-riders are punished as well. We will show 
that in structured populations self-organizing spatiotemporal 
structures can maintain pool-punishment viable without such 
an assumption. Indeed, the phase diagrams for three repre- 
sentative values of the multiplication parameter at a low level 
of noise indicate surprisingly rich behavior depending on the 
punishment fine and cost. 



II. SPATIAL PUBLIC GOODS GAME WITH 
POOL-PUNISHMENT 

The PGG is staged on a square lattice with periodic bound- 
ary conditions. The players are arranged into overlapping five- 
person (G = 5) groups in a way such that the focal players 
are surrounded by their four nearest neighbors each. Accord- 
ingly, each individual belongs to G = 5 different groups. All 
the players thus play five five-person PGGs by following the 
same strategy in every group they are affiliated with. Initially 
each player on site x is designated either as a pool-punisher
(s_x = O), cooperator (s_x = C), or defector (s_x = D)
with equal probability. Using standard parametrization, the
two cooperating strategies O and C contribute a fixed amount
(here set equal to 1 without loss of generality) to
the public good while defectors contribute nothing. The sum
of all contributions in each group is multiplied by the factor
1 < r < G, reflecting the synergetic effects of cooperation,
and the resulting amount is then equally divided among all the
group members irrespective of their strategies.

Pool-punishment requires precursive allocation of resources
and therefore each punisher contributes an amount
γ to the punishment pool irrespective of the strategies in its
neighborhood. Defectors, on the other hand, must bear the
punishment fine β, but only if there is at least one pool-punisher
present in the group. Denoting the number of cooperators
(C), pool-punishers (O) and defectors (D) in a given
group g by N_C^g, N_O^g and N_D^g, respectively, the payoffs

$$
\begin{aligned}
P_C^g &= r\,(N_C^g + N_O^g)/G - 1, \\
P_O^g &= P_C^g - \gamma, \\
P_D^g &= r\,(N_C^g + N_O^g)/G - \beta f(N_O^g)
\end{aligned}
\tag{1}
$$

are obtained by each player x depending on its strategy s_x,
where the step-like function f(Z) is 1 if Z > 0 and 0 otherwise.
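
As a concrete reading of the payoff definitions in Eq. (1), the following minimal Python sketch evaluates the three payoffs for a single group. The function name `group_payoffs` and its argument names are hypothetical and chosen only for illustration. The pool-punisher payoff is written as the cooperator payoff minus γ, which is how the text describes the extra contribution to the punishment pool.

```python
def group_payoffs(n_c, n_o, n_d, r, beta, gamma):
    """Payoffs (P_C, P_O, P_D) earned in a single group, following Eq. (1).

    n_c, n_o, n_d: numbers of cooperators, pool-punishers and defectors.
    r: synergy factor; beta: fine; gamma: cost of pool-punishment.
    """
    G = n_c + n_o + n_d                # group size (G = 5 on the square lattice)
    share = r * (n_c + n_o) / G        # equal share of the multiplied common pool
    fined = 1.0 if n_o > 0 else 0.0    # step function f(N_O)
    p_c = share - 1.0                  # cooperator: contributed 1 to the common pool
    p_o = p_c - gamma                  # pool-punisher: additionally pays gamma
    p_d = share - beta * fined         # defector: fined only if a punisher is present
    return p_c, p_o, p_d


# Parameter values quoted in the text for Fig. 1: r = 2.0, gamma = 0.1, beta = 0.79.
p_c, p_o, p_d = group_payoffs(n_c=2, n_o=1, n_d=2, r=2.0, beta=0.79, gamma=0.1)
print(p_c, p_o, p_d)
```

For this mixed group the payoffs are approximately 0.2, 0.1 and 0.41, so the defector still earns the most within the group even though it is fined. This is why the spatial arrangement of the strategies matters for the outcome.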

In the Monte Carlo (MC) simulations the system is started
from a random initial strategy distribution, and its evolution
is subsequently controlled by the more realistic random sequential
strategy updates [31]. During these elementary processes
a randomly selected player x plays the public goods
game with its interaction partners as a member of all the
g = 1, ..., G groups, whereby its overall payoff is
P_{s_x} = Σ_g P_{s_x}^g. Next, player x chooses one of its four nearest
neighbors at random, and the chosen co-player y also acquires
its payoff P_{s_y} in the same way. Finally, player x imitates
the strategy of player y with a probability w(s_x → s_y) =
1/{1 + exp[(P_{s_x} − P_{s_y})/K]}, where K quantifies the uncertainty
in strategy adoptions [32]. Without loss of generality
we set K = 0.5, thereby allowing also direct comparisons
with previous results obtained for the same level of noise.
Each Monte Carlo step (MCS) (interpreted as a unit
of time) gives a chance for the players to adopt a strategy from
one of their neighbors once on average. The average frequencies
of pool-punishers (ρ_O), cooperators (ρ_C) and defectors
(ρ_D) on the square lattice are determined in the stationary state
after a sufficiently long relaxation time t_r. Depending on the
actual conditions (proximity to phase transition points and the
typical size of emerging spatial patterns) the linear system size
was varied from L = 200 to 5000, and both the relaxation (t_r)
and the sampling (t_s) time were varied from t_r = t_s = 10^
to 10^ MCS to ensure that the statistical error is comparable
with the line thickness in the plots.
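
The update procedure described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the authors' code. The integer labels, the function names and the tiny lattice size are hypothetical. Skipping pairs with identical strategies is a shortcut that does not change the outcome, because imitating an identical strategy leaves the lattice unchanged.

```python
import math
import random

D, C, O = 0, 1, 2  # hypothetical labels: defector, cooperator, pool-punisher


def neighbors(x, y, L):
    """Four nearest neighbors of site (x, y) with periodic boundaries."""
    return [((x + 1) % L, y), ((x - 1) % L, y), (x, (y + 1) % L), (x, (y - 1) % L)]


def total_payoff(s, x, y, r, beta, gamma):
    """Payoff of the player at (x, y) summed over the five groups it belongs to."""
    L = len(s)
    me = s[x][y]
    total = 0.0
    for cx, cy in [(x, y)] + neighbors(x, y, L):   # focal sites of the five groups
        members = [(cx, cy)] + neighbors(cx, cy, L)
        n_o = sum(s[i][j] == O for i, j in members)
        n_c = sum(s[i][j] == C for i, j in members)
        share = r * (n_c + n_o) / 5.0
        if me == D:
            total += share - (beta if n_o > 0 else 0.0)
        else:
            total += share - 1.0 - (gamma if me == O else 0.0)
    return total


def adoption_probability(p_x, p_y, K):
    """Probability that x adopts the strategy of y, as given in the text."""
    return 1.0 / (1.0 + math.exp((p_x - p_y) / K))


def monte_carlo_step(s, r, beta, gamma, K, rng):
    """One MCS: L*L elementary random sequential updates."""
    L = len(s)
    for _ in range(L * L):
        x, y = rng.randrange(L), rng.randrange(L)
        nx, ny = rng.choice(neighbors(x, y, L))
        if s[x][y] == s[nx][ny]:
            continue                               # nothing new to imitate
        p_x = total_payoff(s, x, y, r, beta, gamma)
        p_y = total_payoff(s, nx, ny, r, beta, gamma)
        if rng.random() < adoption_probability(p_x, p_y, K):
            s[x][y] = s[nx][ny]


def frequencies(s):
    """Fractions (rho_D, rho_C, rho_O) of the three strategies."""
    flat = [v for row in s for v in row]
    return tuple(flat.count(k) / len(flat) for k in (D, C, O))


rng = random.Random(1)
L = 20                                             # tiny lattice, illustration only
lattice = [[rng.choice((D, C, O)) for _ in range(L)] for _ in range(L)]
for _ in range(10):
    monte_carlo_step(lattice, r=2.0, beta=0.79, gamma=0.1, K=0.5, rng=rng)
print(frequencies(lattice))
```

A lattice this small only demonstrates the mechanics. The system sizes and relaxation times quoted in the text are orders of magnitude larger, and the stationary frequencies would be averaged over a sampling period after relaxation rather than read off a single configuration.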

The first study of the spatial two-strategy (D and C) evolutionary
prisoner's dilemma games (PDGs) indicated that the
survival of cooperators is supported if they form compact clusters [33].
Similar phenomena were subsequently reported for
spatial evolutionary PGGs [10, 34]. In contrast, the survival
of defectors is enhanced if they are distributed sparsely.
Quantitative analyses have revealed that cooperators and defectors
coexist in the stationary state if r_c1 < r < r_c2 (henceforth
this state will be denoted as DC), where the two threshold
values depend on the connectivity structure (including the
group size G) and the noise level. Below (above) the borders
of the coexistence phase only defectors (cooperators) remain
alive, while within the DC region ρ_C increases monotonically
from 0 to 1. It turned out, furthermore, that for the spatial
PGG the extension of the coexistence region (r_c2 − r_c1) remains
finite in the zero noise limit for all the previously studied
connectivity structures [35]. This is in sharp contrast with
the results obtained for spatial PDGs, where r_c2 − r_c1 → 0 in
the K → 0 limit for several connectivity structures (e.g. on
the square lattice) [36, 37]. Consequently, in our simulations
the noise level K = 0.5 yields a typical low noise behavior
with a sufficiently fast relaxation towards the final stationary
state.

Evidently, if only pure cooperators and pool-punishers are 
initially present in the system, then all the pool-punishers 
will eventually die out because of their lower payoff. At the 
same time, analogous to the coexistence of the D and C
strategies, the D and O players can also coexist in the so-called
DO phase that is bounded to a synergy factor region
r'_c1 < r < r'_c2, where the two threshold values are affected
not only by the connectivity structure and the noise level, but
also by the values of the punishment fine β and cost γ.

We emphasize that the homogeneous one-strategy solu- 
tions, denoted henceforth as C, D and O phases, are absorbing 
states because the applied dynamical rule leaves these states 
unchanged once the system arrives there. Due to the analogy 
between the presently applied imitation rule and the spread- 
ing of infections, simplified by the contact process [38], it is 
expected that upon varying one of the parameters the above-mentioned
continuous phase transitions from a two-strategy
state to one of the homogeneous phases will belong to the
directed percolation universality class. Up to now this was
confirmed only for the spatial PDGs [32, 39] (for further references
see [40]). In the following sections the power law behavior
of the extinction process is verified only for a few cases
because of the huge computational capacity that is required
for this. Similar critical transitions can also be observed when
a three-strategy state transforms into a two-strategy state by
varying a control parameter. Such behavior was already observed
previously in a spatial evolutionary PGG with volunteering.

In the majority of spatial systems the three-strategy states
are maintained by cyclic dominance among the three strategies.
Examples include the PGG and the PDG with
voluntary participation, as well as other three-strategy
(e.g. cooperation, defection and tit-for-tat) variants of the
PDG. In the spatial PGG with pool-punishment, the
cooperators can invade the territory of punishers, the punishers
can occupy the sites of neighboring defectors, while defectors
may outperform cooperators within a wide range of
parameters. We find that this rock-paper-scissors type cyclic
dominance yields a self-organizing pattern, which we will
henceforth denote by (D+C+O)_c. We emphasize that an analogous
three-strategy phase governed by cyclic dominance cannot
be observed if peer-punishment is considered, because
there, in the spatial mixture of cooperators and peer-punishers,
both types of players receive the same payoff, and furthermore,
due to the random imitation the evolution of the system
becomes equivalent to that of the voter model [38]. Interestingly
though, the extinction of free-rider pure cooperators can
be catalyzed efficiently by adding defectors via rare random
mutations [28]. The description and notation of additional
phases including the governing phase transitions will be given
below at the place of their occurrence.

FIG. 1: (Color online) Evolution of strategy distribution for three different initial states. The upper row shows the evolution from a random initial
state while the middle row shows the evolution from a prepared state. The bottom row demonstrates the time evolution when the final states of
the upper evolutions meet. We used identical parameters for all cases, namely L = 390, r = 2.0, γ = 0.1, β = 0.79 and K = 0.5. Black, white,
and blue (grey in black&white print) denote players with defector, pure cooperator, and pool-punisher strategy, respectively.

As a general comment on the difficulties of simulating spatial
systems of cyclically dominant species, we must highlight the
potential problem originating from small system size. If the
system size is not large enough then the simulations can result
in one- and/or two-strategy solutions that are unstable against
the introduction of a group of mutants. For example, the homogeneous
C or O phases can be invaded completely by the
offspring of a single defector inserted into the system at sufficiently
low values of r. Conversely, the D phase can
be fully occupied by a single group of pool-punishers (or cooperators)
if initially they form a sufficiently large compact
cluster (e.g. a rectangular box). In such cases the competition
between two homogeneous phases can be characterized
by the average velocity of the invasion fronts separating the
two spatial solutions characterized by a proper composition
and spatiotemporal structure. Generally, the same method can
also be used to determine the winner between any two possible
spatial solutions.

For the considered imitation rule a system with three (or 
more) strategies has a large number of possible solutions be- 
cause all the solutions of each subsystem (comprising only a 
subset of all the original strategies) are also solutions of the 
whole system [40]. In such situations the most stable solution
can be deduced by performing a systematic check of stability
(direction of invasion) between all the possible pairs of
subsystem solutions that are separated by an interface in the 
spatial system. Fortunately, this analysis can be performed si- 
multaneously if we choose a suitable patchy structure of sub- 
system solutions where all the possible interfaces are present. 
The whole grid is then divided into several large rectangu- 
lar boxes with different initial strategy distributions (contain- 
ing one, two or three strategies), and the strategy adoptions 
across the interfaces are initially forbidden for a sufficiently 
long initialization period. By using this approach one can 
avoid the difficulties associated either with the fast transients 
from a random initial state or with the different time scales 
that characterize the formation of possible subsystem solutions.
In contrast, it is easy to see that a random initial
state may not necessarily offer equal chances for every solu- 
tion to emerge. Evidently, if the system size is large enough 
then these solutions can form locally and the most stable one 
can subsequently invade the whole system. At small system 
sizes, however, only those solutions can evolve whose charac- 
teristic formation times are short enough. 
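
The prepared "patchy" initial state described above can be sketched as follows. This is only an illustration under stated assumptions. The function names, the integer strategy labels and the row-by-row assignment of compositions to boxes are hypothetical. The simple layout used here does not by itself guarantee that every pair of subsystem solutions shares an interface, so the box arrangement would have to be chosen with that in mind.

```python
import random

D, C, O = 0, 1, 2  # hypothetical labels: defector, cooperator, pool-punisher


def patchy_initial_state(L, compositions, boxes_per_side, rng):
    """Fill an L x L lattice with rectangular boxes of prescribed compositions.

    compositions: list of tuples of allowed strategies, e.g. (D,), (D, O), (D, C, O).
    Box k (counted row by row) is filled at random from compositions[k % len(compositions)].
    Returns the lattice and a same-shaped array holding the box index of each site.
    """
    size = L // boxes_per_side
    lattice = [[D] * L for _ in range(L)]
    box_id = [[0] * L for _ in range(L)]
    for x in range(L):
        for y in range(L):
            bx = min(x // size, boxes_per_side - 1)
            by = min(y // size, boxes_per_side - 1)
            k = bx * boxes_per_side + by
            box_id[x][y] = k
            lattice[x][y] = rng.choice(compositions[k % len(compositions)])
    return lattice, box_id


def update_allowed(box_id, site, neighbor, t, t_init):
    """Forbid strategy adoption across box interfaces while t < t_init."""
    (x, y), (nx, ny) = site, neighbor
    return t >= t_init or box_id[x][y] == box_id[nx][ny]


compositions = [(D,), (C,), (O,), (D, C), (D, O), (C, O), (D, C, O)]
lattice, box_id = patchy_initial_state(L=90, compositions=compositions,
                                       boxes_per_side=3, rng=random.Random(0))
```

During the initialization period each box relaxes towards its own subsystem solution, because `update_allowed` rejects imitation attempts that cross a box boundary. Once `t` reaches `t_init`, the interfaces are released and the invasion fronts decide which solution spreads.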

To illustrate the possible problem of random initial states
when using small system sizes, we compare the time evolution
of strategy distributions for different initial states in
Fig. 1. For appropriate comparisons, naturally, we have used
identical model parameters for all cases, namely synergy factor
r = 2.0, the cost of punishment γ = 0.1, and the
fine β = 0.79 at system size L = 390. The upper three
snapshots demonstrate that the system arrives at the O phase
when the strategies are initially distributed randomly (snapshots
are given at t = 0, 100, and 1000 MCS). The middle
panel demonstrates what happens if the initial state (left side)
contains all the possible interfaces and vertices of homogeneous
domains of the three strategies (snapshots are given at
t = 0, 200, and 3750 MCS). Here, the right plot illustrates the
(D+C+O)_c stationary state in which all the three strategies are
present due to cyclic dominance. The bottom panel shows the
competition of the O and (D+C+O)_c states where the latter is the
winner (snapshots are given at t = 0, 100, and 1600 MCS). As
we have already mentioned, the most stable (D+C+O)_c state
can also emerge and spread from a random initial state, but
only if the system size is large enough [in case of random initial
conditions the system size should exceed L = 1500 for
these (r, γ, β) parameter values to obtain a reliable solution].



III. RESULTS OF MONTE CARLO SIMULATIONS 

Systematic MC simulations are performed to reveal phase 
diagrams for three representative values of the multiplicative 
factor r. In each case we have determined the stationary frequencies
of strategies when varying the fine β for many fixed
values of cost γ (β, γ > 0). The transition points and the type
of phase transitions are identified from MC data collected with 
a sufficiently high accuracy (and frequency) in the close vicin- 
ity of the transition points. Finally, the phase boundaries, sep- 
arating different stable solutions, are plotted in the full fine- 
cost phase diagrams. 

The three values of r give rise to fundamentally different
behavior when increasing the value of fine. In the first two
cases (r = 2 and 3.5) the cooperators die out in the absence
of pool-punishment [35]. The third value of the synergy factor
(r = 3.8) is chosen to illustrate the impact of punishment
when C and D coexist in the absence of O. Evidently, the consideration
of punishment becomes futile when the cooperators
beat defectors in the absence of punishers (r > r_c2). When r
increases towards r_c2 the effect of punishment decreases with
ρ_D (within the DC phase). The obtained quantitative results
are discussed in detail in the following three subsections.

FIG. 2: (Color online) Strategy frequencies vs. fine β for the punishment
cost γ = 0.01 at r = 2.0 and K = 0.5.



A. Results for the synergy factor r = 2.0 

First, we illustrate the variation of strategy frequencies and 
also the phase transitions obtained by means of MC simulations
as a function of fine for a low value of cost. Figure 2
shows consecutive transitions from the pure D phase to the
final (D+C+O)_c phase described above.

When increasing the fine β at a low value of cost (γ = 0.01)
one can observe three continuous phase transitions. First, the
homogeneous defector state (D) transforms into the coexistence
of defectors and pool-punishers (DO). In this phase,
pool-punishers form compact clusters to survive in the sea
of defectors. This mechanism is identical to the previously
identified network reciprocity that enables pure cooperators
to coexist with defectors [33]. Cooperators who refuse to bear
the cost of punishment, however, are unable to survive due
to the low value of r. Within the DO phase the frequency of
pool-punishers increases continuously until the homogeneous
O phase is reached. Surprisingly, further increasing β induces
an additional phase transformation from the O phase into the
(D+C+O)_c phase, where the self-organizing pattern is maintained
by cyclic dominance and the nonzero frequencies for
all strategies remain valid in the large fine limit.

Within the (D+C+O)_c phase ρ_O decreases monotonically
with β in agreement with the anomalous behavior frequently
referred to as the "survival of the weakest" [46]. In the present
case, the increase of fine reduces the income of the punished
defectors, which allows pure cooperators to survive. The latter
strategy behaves as the "predator" of pool-punishers, resulting
in the decay of ρ_O despite the increasing fine. The
same cyclic dominance mediated complex interaction is able
to increase ρ_O when γ is increased (in this case the less effective
punishment does not allow C players, who are the "prey"
of D, to survive). Similar effects were already reported in
several three-strategy models, including the simpler spatial
rock-paper-scissors game, and the main features were justified
by mean-field approximations and pair-approximations
(for a brief survey see the review [40] and further references
therein). The robustness of this behavior can be demonstrated
effectively by a snapshot (see Fig. 3), illustrating significantly
different interfaces between the coexisting phases.

FIG. 3: (Color online) Typical distribution of strategies on a 400 × 400
portion of a larger square lattice for r = 2.0, β = 0.01 and
γ = 1.0. The color code is the same as used in Fig. 1 and for symbols
in Fig. 2.

At such a low punishment cost the cooperators can invade
the sites of pool-punishers, albeit very slowly and only within
the territories they have in common. It is emphasized that
within these two-strategy territories in the γ → 0 limit the
strategy evolution reproduces the behavior of the voter model
with equivalent strategies exhibiting rough interfaces and extremely
slow coarsening [47]. For low but finite values of γ
the two-strategy system evolves slowly towards the homogeneous
C state while the interfaces remain irregular as demonstrated
in Fig. 3. Notice that the interfaces separating the domains
of defectors from cooperators or defectors from pool-punishers
are less irregular, thus signaling the more obvious
dominance between these strategy pairs.

The increase of the punishment cost γ reduces the net income
of pool-punishers, consequently yielding fundamentally
different variations in the strategy frequencies upon increasing
the fine β, as demonstrated in Fig. 4. The upper plot illustrates
the disappearance of the pure O phase if γ = 0.1. The
extension (along β) of the DO phase decreases linearly with
γ and vanishes at γ = 0.212. At the same time, the homogeneous
O phase can also be observed between the phases D
and (D+C+O)_c, but only if the cost exceeds a threshold value
[here γ > γ_th(r = 2) = 0.113] that depends also on the
multiplication factor r.

FIG. 4: (Color online) Strategy frequencies (ρ_D, ρ_C, ρ_O) vs. fine β for the punishment
cost γ = 0.1 (top) and γ = 0.2 (bottom) at r = 2.0 and
K = 0.5.

Notice that for both values of the punishment cost γ the
(D+C+DO)_c phase occurs via a first-order (discontinuous)
phase transition when β increases, as can be inferred from the
two panels of Fig. 4. Furthermore, the D → O transition
also becomes discontinuous in the absence of the DO phase.

The above numerical investigations were repeated for many
other values of γ, and the results are summarized in the full
fine-cost phase diagram presented in Fig. 5, where the lower
plot magnifies the most complex (small-cost) region. The
lower (magnified) phase diagram refers to an additional new
phase [(D+C+DO)_c marked with an arrow], which we will,
however, address in the following section because it has more
obvious consequences at higher values of r.

In general, the presented fine-cost phase diagram shows 
clearly that in this low-r region only defectors remain alive if 
the fine does not exceed a threshold value that increases approximately
linearly with the cost of punishment. More precisely,
in the low noise limit the phase boundary separating
the D and O phases approaches the straight line with a slope
of 4/5, and this boundary moves left if r is increased.

FIG. 6: (Color online) Strategy frequencies vs. fine β for the punishment
cost γ = 0.01 at r = 3.5 and K = 0.5.

FIG. 7: (Color online) Strategy frequencies vs. fine β for the punishment
cost γ = 0.4 at r = 3.5 and K = 0.5.
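
The slope of 4/5 quoted for the boundary between the D and O phases at r = 2.0 can be made plausible by a simple payoff comparison. The worked estimate below is only a heuristic sketch, not a derivation given in the source. It assumes a flat interface between a homogeneous O domain and a homogeneous D domain without cooperators. It also assumes that in the low noise limit the front stands still when the two players facing each other across the interface collect equal total payoffs, and that the cost is plotted against the fine. The symbols \Pi_O and \Pi_D are introduced here for illustration.

```latex
% Pool-punisher at the interface: its five groups contain 5, 4, 4, 4 and 1
% contributors (18 in total), and it pays 1 + \gamma in each of the five groups.
\Pi_O = \frac{18\,r}{5} - 5\,(1 + \gamma)

% Defector at the interface: its five groups contain 4, 1, 1, 1 and 0
% pool-punishers (7 in total), and it is fined in the four groups with N_O > 0.
\Pi_D = \frac{7\,r}{5} - 4\,\beta

% Equal payoffs, \Pi_O = \Pi_D, give a straight line in the fine-cost plane:
\gamma = \frac{4}{5}\,\beta + \frac{11\,r}{25} - 1
```

Under these assumptions the line has slope 4/5, and at fixed cost the threshold fine β = (5γ + 5 − 11r/5)/4 decreases as r grows, in line with the statement that the boundary moves left when r is increased.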



B. Results for the synergy factor r = 3.5

Here for the low cost limit the system behavior is similar to
the one described in the previous section. The relevant difference
is the absence of the O phase in the series of transitions
upon increasing the fine β, as demonstrated in Fig. 6. Notice
that both transitions are continuous. The quantitative analysis
supports the conjecture [48] that the continuous extinction
of either the pool-punishers or the cooperators belongs to the
directed percolation universality class.

Figure 7 shows relevant differences in the fine-dependence
of the strategy frequencies for higher values of γ. In this case
our simulations indicate four intermediate phases between the
phases D and (D+C+O)_c if the fine is increased at a fixed cost
and noise level. The five critical points for the consecutive
transitions will be denoted as β_c1 < ... < β_c5. For example,
the first transition at β = β_c1 refers to a continuous transition
from the phase D to DO in agreement with the cases discussed
above.

We first emphasize a striking novel feature in the fine-dependence
of the strategy frequencies within the (D+C+O)_c
phase [β > β_c5(γ = 0.4) = 1.30(1)]. In this case ρ_D → 1
(and evidently, both ρ_C and ρ_O converge to zero) if β approaches
β_c5 from higher values, in contrast to all the previous
cases demonstrated in Figs. 2, 4 and 6. This behavior is
accompanied by a drastic change in the governing spatial
patterns, as demonstrated in Fig. 8.

Namely, in the close vicinity of the transition point smaller
or larger islands of cooperators and/or pool-punishers are dispersed
in the sea of defectors. Due to the cyclic dominance
the islands of pool-punishers are blowing up while the cooperator
islands are shrinking and disappear in most of the cases
for the given r. The survival of cooperators is ensured by
approaching a growing O island (such a situation is shown
in the center of Fig. 8) that is occupied quickly by the offspring
of the lucky cooperator. The resultant cooperator island
is attacked simultaneously by defectors whose success
is enhanced by a guerrilla-type warfare fragmenting the cooperator's
territory into a cluster of small shrinking islands as
demonstrated in the same snapshot. This process is repeated
forever if the defectors take pool-punishers off cooperators'
hands with a sufficiently high probability.

FIG. 8: (Color online) Typical distribution of strategies on a 400 × 400
portion of a larger square lattice for r = 3.5, β = 0.2 and
γ = 0.78. The color code is the same as used in Fig. 1 and for
symbols in Figs. 6 and 7.

The above mentioned evolutionary process implies the risk 
of extinction for the strategies that occur with only a low frequency.
Within the region β_c4 < β < β_c5 the fixation of
pool-punishers will occur if the cooperators die out first during
the stochastic extinction process. In the opposite case, if pool- 
punishers go extinct first, the cooperators have no chance to 
survive. Consequently, within this parameter region the sys- 
tem can evolve towards one of the homogeneous states where 
only defectors or pool-punishers are present. Henceforth, the 
corresponding behavior (phase) will be denoted by F, referring 
to fixation. 

During the transient process the strategy frequencies oscillate
with a growing amplitude and oscillation period, and simultaneously,
the trajectory spirals out on the simplex [49].
Similar phenomena were already reported for many other systems
(for references see [1, 40, 49]).

Figure 7 shows that the F phase is surrounded by phases
exhibiting opposite behaviors when approaching the region of 
fixation. When approaching from the left hand side the system 
tends towards the O phase while from the opposite side the 
emergence of the D phase is favored. 

Another novel feature is the appearance of an additional
three-strategy phase within a narrow range of β values
(namely, β_c2 < β < β_c3, where β_c2(γ = 0.4) = 0.607(1)
and β_c3(γ = 0.4) = 0.660(1)). This phase will be denoted as
(D+C+DO)_c because the corresponding snapshot (see Fig. 9)
illustrates clearly that here the cyclic invasions occur between
the D, the C, and the DO phase. This spatiotemporal structure
can be reproduced very rarely because of the fast extinction of
cooperators if the system is started from a random initial state
even for L > 5000. In such a case the system evolves into the
DO phase that is, however, unstable against the invasion of a
cooperator block with a sufficiently large size (e.g. 10 × 10). It
is worth mentioning that the resultant (D+C+DO)_c phase will
appear only after a long transient process.

The cyclic dominance of alliances has already been observed
in spatial ecological models [50]. The present model,
however, offers an interesting new example when one strategy
(one species) fights continuously against a group of strategies
(species), resulting in a stable stationary solution. Notice, furthermore,
that the effective invasion rates between the three
phases are strongly influenced by the composition and the
spatiotemporal structure of the DO phase. This is one of the
reasons why the fine-dependence of strategy frequencies deviates
from the standard behavior discussed for the (D+C+O)_c
phase. The other reason is related to the effect of the pattern
topology itself.

FIG. 9: (Color online) Typical distribution of strategies within the
(D+C+DO)_c phase on a 400 × 400 portion of a larger square lattice
(L = 2000) for r = 3.5, β = 0.2 and γ = 0.5. The color code is
the same as used in Fig. 8.

The MC simulations have confirmed clearly that the transition
(at β = β_c2) from DO to (D+C+DO)_c is continuous while
the subsequent transition (at β = β_c3) is a weakly first-order
one. The latter behavior might be related to the different time
scales (characterizing the average formation and lifetimes of
the competing phases) that depend on β and γ as well as on
the synergy factor r.

As for the r = 2.0 case, for r = 3.5 the effects of different
values of the punishment cost γ on the stationary states are also
studied systematically by means of MC simulations, and the
results are summarized in the full fine-cost phase diagram
presented in Fig. 10. As in the previous phase diagram (Fig. 6),
here the dotted line [separating the territory of the (D+C+DO)c
and (D+C+O)c phases] indicates the transition from phase DO to O
in the absence of cooperators. Notice that this dotted line is
the analytic continuation of the phase boundary separating the
territory of the DO phase and the fixation phase. In fact, the
transition from (D+C+DO)c to (D+C+O)c is made smoother by the
absence of long thermalization (between the domains of D and DO)
that is due to the cyclic dominance emerging in the presence of
cooperators. As a result, the transition cannot be clearly
identified by exclusively considering the frequencies of
strategies and/or nearest-neighbor pair correlations. The
difference between these phases, however, is well recognizable
visually in the snapshots (compare Figs. 8 and 9). The same
arguments are valid in the case of
FIG. 10: Full fine-cost phase diagram for r = 3.5 and K = 0.5.
Solid (dashed) lines indicate continuous (discontinuous) phase tran- 
sitions. The dotted line represents the analytic continuation of the 
phase boundary separating the pure D and O phases in the absence 
of cooperators (C). 



r = 2.0 above, where the (D+C+DO)c phase was also mentioned
(see the bottom panel of Fig. 5).
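
The two observables named in this paragraph, strategy frequencies and nearest-neighbor pair correlations, could be measured on a lattice configuration roughly as follows. This is a minimal Python sketch and not the code used for the reported simulations. The integer strategy codes, the function names, and the use of NumPy are assumptions made here for illustration only.

```python
import numpy as np

# Illustrative integer codes for the three strategies (an assumption).
D, C, O = 0, 1, 2

def strategy_frequencies(lattice):
    """Return the fractions (rho_D, rho_C, rho_O) of an L x L lattice."""
    counts = np.bincount(lattice.ravel(), minlength=3)
    return counts / lattice.size

def pair_frequencies(lattice):
    """Return a symmetric 3x3 matrix of nearest-neighbor pair frequencies.

    Every bond of the periodic square lattice is counted once (right and
    down neighbors). The matrix is symmetrized and normalized to sum to 1,
    so that each row sum equals the frequency of the corresponding strategy.
    """
    pairs = np.zeros((3, 3))
    for axis in (0, 1):
        neighbor = np.roll(lattice, -1, axis=axis)
        # Unbuffered accumulation handles repeated index pairs correctly.
        np.add.at(pairs, (lattice.ravel(), neighbor.ravel()), 1.0)
    pairs = 0.5 * (pairs + pairs.T)
    return pairs / pairs.sum()
```

Comparing pair_frequencies(lattice)[a, b] with the product of the two corresponding strategy frequencies gives a simple measure of how far a configuration is from an uncorrelated mixture. As noted in the text, such averages may still fail to separate the two cyclic phases, which differ mainly in their spatial patterns.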

Despite the many striking differences, there also exist some
qualitative similarities between the fine-cost phase diagrams
obtained for r = 2 and 3.5. Namely, for both cases only
defectors remain alive if the cost γ exceeds a threshold value
(which in both cases increases fairly linearly with the
fine β). Furthermore, the system evolves into the (D+C+DO)c
phase for sufficiently high fines if the cost is less than
another fine-dependent threshold value. Upon decreasing the fine,
a smooth transition from (D+C+O)c to (D+C+DO)c occurs when the
coexistence of the D and O strategies is favored in the
corresponding two-strategy (sub)system. The comparison of the two
phase diagrams (Figs. 6 and 10) illustrates how the (D+C+DO)c
phase expands together with the DO phase when r increases.

The most relevant difference between the two phase diagrams is
represented by the fixation allowing the formation of either the
pure O or pure D phase for r = 3.5, while only the pure O phase
can occur for r = 2.0. This difference implies a significant
deviation in the variation of strategy frequencies if the cost is
increased for a sufficiently large fine. Namely, ρ_C and ρ_O
vanish continuously when approaching the boundary of fixation for
r = 3.5, while the (D+C+O)c phase transforms into the O phase via
a first-order (discontinuous) transition if r = 2.0. The
corresponding territories (F and O phases on the fine-cost
parameter plane) separate the D and (D+C+O)c phases.



C. Results for the synergy factor r = 3.8 

In contrast with the values of the synergy factor r considered
above, here the magnitude is large enough so that even in the
absence of pool-punishers pure cooperators can survive in the
presence of defectors. Accordingly, the enhancement of r
FIG. 11: (Color online) Strategy frequencies vs. fine β for the
punishment cost γ = 0.05 (top) and γ = 0.1 (bottom) at r = 3.8 and
K = 0.5.



reduces the temptation to choose defection in the PGG, which
ultimately yields a continuous increase in the cooperator
frequency for the two-strategy (D and C) system. For the present
interaction graph (the square lattice) cooperators survive if
r > r_th1 = 3.744. In this situation the efficiency of punishment
decreases along with the frequency of defectors who are
negatively affected by the sanctions. We therefore consider the
efficiency of institutionalized punishment in the case when
ρ_D/ρ_C ≈ 2 in the absence of pool-punishers (ρ_O = 0). For this
value of ρ_D/ρ_C we can extract quantitative results giving a
sufficiently accurate and general picture of the impact of
pool-punishment.

First, we demonstrate the variations in the strategy frequencies
upon increasing the fine β for two (low) values of the cost γ
characterizing the relevant processes. Figure 11 shows that the
pool-punishers disappear (ρ_O = 0) if the fine does not exceed a
threshold value that increases with the cost γ. Evidently, ρ_D and
ρ_C remain unchanged in the absence of pool-punishers.

For both low values of the cost there exists a region of fines
where cooperators die out and pool-punishers maintain
cooperation at a level that increases with the fine.

The replacement of the DC phase by the DO phase happens 
via a first-order (discontinuous) phase transition, which is a 
manifestation of a more general phenomenon. Namely, the 
Solid (dashed) lines indicate continuous (discontinuous) phase
transitions. The dotted line represents the analytic continuation
of the phase boundary separating the pure D and O phases in the
absence of cooperators (C).



cooperative C and O strategies fight separately against defec- 
tors (D players) and the final output depends on the success of 
these D-C and D-O struggles. Accordingly, the "indirect
territorial battle" between O and C strategies will determine 
the final output of the game. This mechanism has already been
observed for peer-punishment [27] and in the spatial public
goods game with reward [51].

Paradoxically, a further increase of the punishment fine β
enhances the chance of cooperators to survive. Consequently, the
system behavior transforms into the cyclic-dominance-governed
(D+C+DO)c phase. For sufficiently high values of the fine β,
however, the defectors are unable to survive within the domains
of pool-punishers, and accordingly, the (D+C+DO)c phase evolves
into the (D+C+O)c phase as described above. On the contrary, for
higher values of the punishment cost γ, the (D+C+DO)c phase
transforms into the DC phase via a continuous transition, as
illustrated in the lower panel of Fig. 11. The MC simulations
indicate that within a narrow range of cost values the (D+C+O)c
phase can occur and vanish again continuously if the fine is
further increased.

From systematic numerical investigations for several different
values of γ, we obtain the full fine-cost phase diagram that is
presented in Fig. 12. In agreement with our expectations, the
pool-punishers cannot support cooperation (and die out) if the
cost of punishment exceeds a threshold value. The latter, of
course, depends on the punishment fine β. In other words, at such
high values of r and γ the pool-punishers become inefficient in
their main task of facilitating the evolution of cooperation.
This behavior is in sharp contrast with the impact of
peer-punishment, which can always eliminate defectors for
sufficiently high values of the fine independently of the value
of r. In the case of pool-punishment the most efficient
suppression of defection is achieved at the right border of the
DO phase, which is formed only within a limited range of fine
values. As we have emphasized, for higher fines the
self-organizing patterns with cyclic invasion of D, C and DO



domains (the latter domains are replaced by D domains if the
DO phase becomes unstable) emerge, which in the case of the
stable (D+C+DO)c phase manifest a new form of cyclic dominance
that forms not just between individual strategies but also
between strategy alliances.



IV. SUMMARY 

The impact of pool-punishment was studied in a spatial public
goods game with cooperators, defectors and pool-punishers as the
three competing strategies. In particular, the efficiency of
pool-punishment in maintaining socially advantageous states was
contrasted with that of peer-punishment [26-28]. For easier
comparison, in both cases the players were located on the sites
of a square lattice, the collected income resulted from five
five-person public goods games, and the strategy evolution was
governed by the same stochastic imitation rule. Monte Carlo
simulations, performed for different combinations of the fine and
cost of punishment at three typical values of the multiplication
factor, reveal relevant differences compared with previous
results where the peer-punishers were able to dominate if the
fine exceeded a threshold value that increased with the cost.
Here, on the contrary, the institutional sanctions are less
effective because cooperators always invade the territories of
pool-punishers, even for marginally positive values of the
punishment cost. On the other hand, in contrast to the well-mixed
case, maintaining pool-punishment is generally viable without the
necessity of sanctioning the second-order free-riders [59].
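
The setup summarized above (a square lattice, payoffs collected from five five-person games, stochastic imitation) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The per-group payoff forms are assumed here and are not given in this section: each C or O contributes 1, the pool is multiplied by r and shared equally, each O additionally pays the cost gamma, and each D pays the fine beta whenever at least one O is present in the group. The Fermi-type imitation probability is the usual form for pairwise comparison with noise K and is likewise an assumption. All function names are hypothetical.

```python
import numpy as np

D, C, O = 0, 1, 2  # illustrative integer codes (an assumption)

def group_members(x, y, L):
    """Sites of the five-person group centered on (x, y), periodic lattice."""
    return [(x, y), ((x + 1) % L, y), ((x - 1) % L, y),
            (x, (y + 1) % L), (x, (y - 1) % L)]

def payoff(lattice, x, y, r, beta, gamma):
    """Total payoff of the player at (x, y) from the five groups it belongs to.

    Assumed per-group payoffs (group size 5): contributors (C and O) pay 1,
    the pool is multiplied by r and shared equally, an O also pays gamma,
    and a D pays beta if the group contains at least one O.
    """
    L = lattice.shape[0]
    s = lattice[x, y]
    total = 0.0
    # The five groups are centered on the player and on its four neighbors.
    for cx, cy in group_members(x, y, L):
        members = [lattice[i, j] for i, j in group_members(cx, cy, L)]
        n_contrib = sum(m != D for m in members)
        n_pun = sum(m == O for m in members)
        share = r * n_contrib / 5.0
        if s == D:
            total += share - (beta if n_pun > 0 else 0.0)
        elif s == C:
            total += share - 1.0
        else:
            total += share - 1.0 - gamma
    return total

def imitation_probability(p_x, p_y, K=0.5):
    """Fermi-type probability that player x adopts the strategy of player y."""
    return 1.0 / (1.0 + np.exp((p_x - p_y) / K))
```

In an elementary Monte Carlo step one would pick a random player x and a random neighbor y, evaluate both payoffs with payoff(...), and let x copy the strategy of y with probability imitation_probability(P_x, P_y, K). The default K = 0.5 matches the value quoted in the figure captions.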

It turns out that the pool-punishers can dominate the system
only within a strongly limited cooperator-unfriendly region of
parameters. Meanwhile, for high fines the system paradoxically
evolves into a self-organizing spatiotemporal pattern where the
rock-paper-scissors type cyclic dominance helps the coexistence
of all three strategies. In fact, we could distinguish two
different cyclic phases, namely, (D+C+O)c and (D+C+DO)c. The
latter phase represents a new type of cyclic dominance in which
single strategies (D and C) fight against an alliance (D + O).
The possibility of these two phases governed by cyclic dominance
is accompanied by an unusual sensitivity to the topological
features of the self-organizing patterns. As a result, in some
cases we have observed fixation to either the homogeneous
defector or the homogeneous pool-punisher phase. Based on earlier
works examining the spatial PGG and its variants, the reported
impact of pool-punishment is expected to be robust against using
different interaction graphs and group sizes.

The accurate determination of the presented phase diagrams (for
the large size limit) required a careful stability analysis based
on the concept of competing associations [40]. In the light of
our results we can strongly recommend the application of this
method for other multi-strategy systems where a complex phase
diagram is likely to be encountered. A potential example is given
by the present evolutionary PGG with four rather than three
strategies (containing also peer-punishers besides cooperators,
defectors and pool-punishers), which we wish to study in the near
future.